On Approximate Nearest Neighbors in Non-Euclidean Spaces

Author

Piotr Indyk

Abstract

The nearest neighbor search (NNS) problem is the following: given a set of $n$ points $P = \{p_1, \ldots, p_n\}$ in some metric space $X$, preprocess $P$ so as to efficiently answer queries which require finding a point in $P$ closest to a query point $q \in X$. The approximate nearest neighbor search ($c$-NNS) is a relaxation of NNS which allows returning any point within $c$ times the distance to the nearest neighbor (called a $c$-nearest neighbor). This problem is of major and growing importance to a variety of applications. In this paper, we give an algorithm for $(4\lceil \log_{1+\rho} \log 4d \rceil + 3)$-NNS in $l_\infty^d$ with $O(d n^{1+\rho} \log n)$ storage and $O(d \log n)$ query time. In particular, this yields the first algorithm for $O(1)$-NNS for $l_\infty$ with subexponential storage. The preprocessing time is close to linear in the size of the data structure. The algorithm can also be used (after simple modifications) to output the exact nearest neighbor in time bounded by $O(d \log n)$ plus the number of $(4\lceil \log_{1+\rho} \log 4d \rceil + 3)$-nearest neighbors of the query point. Building on this result, we also obtain an approximation algorithm for a general class of product metrics. Finally, we show that for any $c < 3$ the $c$-NNS problem in $l_\infty$ is provably hard for a version of the indexing model introduced by Hellerstein et al. [HKP97] (our upper bound can be adapted to work in this model).
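To make the definition of a $c$-nearest neighbor concrete, the following is a minimal brute-force sketch in Python. It is not the data structure from the paper: it scans all points in linear time and only illustrates what a valid $c$-NNS answer is. The maximum-coordinate ($l_\infty$) distance is assumed as the metric, and the names `linf_distance`, `nearest_neighbor`, and `is_c_nearest_neighbor` are hypothetical, chosen for illustration.

```python
from typing import Sequence

Point = Sequence[float]


def linf_distance(a: Point, b: Point) -> float:
    """l_infinity distance: the largest coordinate-wise absolute difference."""
    return max(abs(x - y) for x, y in zip(a, b))


def nearest_neighbor(points: Sequence[Point], q: Point) -> Point:
    """Exact NNS by linear scan (illustrative baseline, not the paper's method)."""
    return min(points, key=lambda p: linf_distance(p, q))


def is_c_nearest_neighbor(points: Sequence[Point], q: Point,
                          candidate: Point, c: float) -> bool:
    """True if candidate lies within c times the distance from q to its nearest neighbor."""
    best = linf_distance(nearest_neighbor(points, q), q)
    return linf_distance(candidate, q) <= c * best


# Hypothetical toy data: three points in the plane and one query point.
P = [(0.0, 0.0), (4.0, 1.0), (1.0, 5.0)]
q = (1.0, 1.0)
```

With this toy data, the distances from `q` to the three points are 1, 3, and 4, so `(4.0, 1.0)` is an acceptable answer for $c = 3$ while `(1.0, 5.0)` is not.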

Related resources

Quantitative Analysis of Nearest-Neighbors Search in High-Dimensional Sampling-Based Motion Planning

We quantitatively analyze the performance of exact and approximate nearest-neighbors algorithms on increasingly high-dimensional problems in the context of sampling-based motion planning. We study the impact of the dimension, number of samples, distance metrics, and sampling schemes on the efficiency and accuracy of nearest-neighbors algorithms. Efficiency measures computation time and accuracy...

Metric-Based Shape Retrieval in Large Databases

This paper examines the problem of database organization and retrieval based on computing metric pairwise distances. A low-dimensional Euclidean approximation of a high-dimensional metric space is not efficient, while search in a high-dimensional Euclidean space suffers from the “curse of dimensionality”. Thus, techniques designed for searching metric spaces must be used. We evaluate several su...

Fast Approximate Nearest Neighbor Methods for Non-Euclidean Manifolds with Applications to Human Activity Analysis in Videos

Approximate Nearest Neighbor (ANN) methods such as Locality Sensitive Hashing, Semantic Hashing, and Spectral Hashing, provide computationally efficient procedures for finding objects similar to a query object in large datasets. These methods have been successfully applied to search web-scale datasets that can contain millions of images. Unfortunately, the key assumption in these procedures is ...

High-Dimensional Similarity Search Using Data-Sensitive Space Partitioning

Nearest neighbor search has a wide variety of applications. Unfortunately, the majority of search methods do not scale well with dimensionality. Recent efforts have been focused on finding better approximate solutions that improve the locality of data using dimensionality reduction. However, it is possible to preserve the locality of data and find exact nearest neighbors in high dimensions with...

Approximate nearest neighbor algorithm based on navigable small world graphs

We propose a novel approach to solving the approximate k-nearest neighbor search problem in metric spaces. The search structure is based on a navigable small world graph with vertices corresponding to the stored elements, edges to links between them, and a variation of greedy algorithm for searching. The navigable small world is created simply by keeping old Delaunay graph approximation links p...

Publication date: 1998
